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Abstract We derive the mean-field equations arising as the limit of a network of 
interacting spiking neurons, as the number of neurons goes to infinity. The neurons 
belong to a fixed number of populations and are represented either by the Hodgkin- 
Huxley model or by one of its simplified version, the FitzHugh-Nagumo model. The 
synapses between neurons are either electrical or chemical. The network is assumed 
to be fully connected. The maximum conductances vary randomly. Under the con- 
dition that all neurons' initial conditions are drawn independently from the same 
law that depends only on the population they belong to, we prove that a propa- 
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gation of chaos phenomenon takes place, namely that in the mean-field limit, any 
finite number of neurons become independent and, within each population, have 
the same probability distribution. This probability distribution is a solution of a set 
of implicit equations, either nonlinear stochastic differential equations resembling 
the McKean-Vlasov equations or non-local partial differential equations resembling 
the McKean-Vlasov-Fokker-Planck equations. We prove the well-posedness of the 
McKean-Vlasov equations, i.e. the existence and uniqueness of a solution. We also 
show the results of some numerical experiments that indicate that the mean-field 
equations are a good representation of the mean activity of a finite size network, even 
for modest sizes. These experiments also indicate that the McKean-Vlasov-Fokker- 
Planck equations may be a good way to understand the mean-field dynamics through, 
e.g. a bifurcation analysis. 

Keywords mean-field limits • propagation of chaos • stochastic differential 
equations • McKean-Vlasov equations • Fokker-Planck equations • neural networks • 
neural assemblies • Hodgkin-Huxley neurons • FitzHugh-Nagumo neurons 

Mathematics Subject Classification (2000) 60F99 60B10 92B20 82C32 
82C80 • 35Q80 

1 Introduction 

Cortical activity displays highly complex behaviors which are often characterized by 
the presence of noise. Reliable responses to specific stimuli often arise at the level 
of population assemblies (cortical areas or cortical columns) featuring a very large 
number of neuronal cells, each of these presenting a highly nonlinear behavior, that 
are interconnected in a very intricate fashion. Understanding the global behavior of 
large-scale neural assemblies has been a great endeavor in the past decades. One of 
the main interests of large-scale modeling is characterizing brain functions, which 
most imaging techniques are recording. Moreover, anatomical data recorded in the 
cortex reveal the existence of structures, such as the cortical columns, with a diame- 
ter of about 50 jim to 1 mm, containing the order of 100 to 100,000 neurons belonging 
to a few different types. These columns have specific functions; for example, in the 
human visual area VI, they respond to preferential orientations of bar- shaped visual 
stimuli. In this case, information processing does not occur at the scale of individual 
neurons but rather corresponds to an activity integrating the individual dynamics of 
many interacting neurons and resulting in a mesoscopic signal arising through aver- 
aging effects, and this effectively depends on a few effective control parameters. This 
vision, inherited from statistical physics, requires that the space scale be large enough 
to include sufficiently many neurons and small enough so that the region considered 
is homogeneous. This is, in effect, the case of the cortical columns. 

In the field of mathematics, studying the limits of systems of particle systems in 
interaction has been a long-standing problem and presents many technical difficulties. 
One of the questions addressed in mathematics was to characterize the limit of the 
probability distribution of an infinite set of interacting diffusion processes, and the 
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fluctuations around the limit for a finite number of processes. The first breakthroughs 
to find answers to this question are due to Henry McKean (see, e.g. [1, 2]). It was 
then investigated in various contexts by a large number of authors such as Braun and 
Hepp [3], Dawson [4] and Dobrushin [5], and most of the theory was achieved by 
Tanaka and collaborators [6-9] and of course Sznitman [10-12]. When considering 
that all particles (in our case, neurons) have the same, independent initial condition, 
they are mathematically proved using stochastic theory (the Wasserstein distance, 
large deviation techniques) that in the limit where the number of particles tends to 
infinity, any finite number of particles behaves independently of the other ones, and 
they all present the same probability distribution, which satisfies a nonlinear Markov 
equation. Finite-size fluctuations around the limit are derived in a general case in [10]. 
Most of these models use a standard hypothesis of global Lipschitz continuity and 
linear growth condition of the drift and diffusion coefficients of the diffusions, as well 
as the Lipschitz continuity of the interaction function. Extensions to discontinuous 
cadlag processes including singular interactions (through a local time process) were 
developed in [11]. Problems involving singular interaction variables (e.g. nonsmooth 
functions) are also widely studied in the field, but are not relevant in our case. 

In the present article, we apply this mathematical approach to the problem of in- 
teracting neurons arising in neuroscience. To this end, we extend the theory to en- 
compass a wider class of models. This implies the use of locally (instead of globally) 
Lipschitz coefficients and of a Lyapunov-like growth condition replacing the custom- 
ary linear growth assumption for some of the functions appearing in the equations. 
The contributions of this article are fourfold: 

1 . We derive, in a rigorous manner, the mean-field equations resulting from the inter- 
action of infinitely many neurons in the case of widely accepted models of spiking 
neurons and synapses. 

2. We prove a propagation of chaos property which shows that in the mean-field 
limit, the neurons become independent, in agreement with some recent experimen- 
tal work [13] and with the idea that the brain processes information in a somewhat 
optimal way. 

3. We show, numerically, that the mean-field limit is a good approximation of the 
mean activity of the network even for fairly small sizes of neuronal populations. 

4. We suggest, numerically, that the changes in the dynamics of the mean-field limit 
when varying parameters can be understood by studying the mean-field Fokker- 
Planck equation. 

We start by reviewing such models in the 'Spiking conductance-based models' sec- 
tion to motivate the present study. It is in the 'Mean-field equations for conductance- 
based models' section that we provide the limit equations describing the behaviors 
of an infinite number of interacting neurons and state and prove the existence and 
uniqueness of solutions in the case of conductance-based models. The detailed proof 
of the second main theorem, that of the convergence of the network equations to the 
mean-field limit, is given in the Appendix. In the 'Numerical simulations' section, we 
begin to address the difficult problem of the numerical simulation of the mean-field 
equations and show some results indicating that they may be an efficient way of rep- 
resenting the mean activity of a finite- size network as well as to study the changes in 
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the dynamics when varying biological parameters. The final 'Discussion and conclu- 
sion' section focuses on the conclusions of our mathematical and numerical results 
and raises some important questions for future work. 

2 Spiking conductance-based models 

This section sets the stage for our results. We review in the 'Hodgkin-Huxley model' 
section the Hodgkin-Huxley model equations in the case where both the membrane 
potential and the ion channel equations include noise. We then proceed in the The 
FitzHugh-Nagumo model' section with the FitzHugh-Nagumo equations in the case 
where the membrane potential equation includes noise. We next discuss in the 'Mod- 
els of synapses and maximum conductances' section the connectivity models of net- 
works of such neurons, starting with the synapses, electrical and chemical, and finish- 
ing with several stochastic models of the synaptic weights. In the 'Putting everything 
together' section, we write the network equations in the various cases considered in 
the previous section and express them in a general abstract mathematical form that 
is the one used for stating and proving the results about the mean-field limits in the 
'Mean-field equations for conductance-based models' section. Before we jump into 
this, we conclude in the 'Mean-field methods in computational neuroscience: a quick 
overview' section with a brief overview of the mean- field methods popular in com- 
putational neuroscience. 

From the mathematical point of view, each neuron is a complex system, whose dy- 
namics is often described by a set of stochastic nonlinear differential equations. Such 
models aim at reproducing the biophysics of ion channels governing the membrane 
potential and therefore the spike emission. This is the case of the classical model of 
Hodgkin and Huxley [14] and of its reductions [15-17]. Simpler models use discon- 
tinuous processes mimicking the spike emission by modeling the membrane voltage 
and considering that spikes are emitted when it reaches a given threshold. These are 
called integrate-and-fire models [18, 19] and will not be addressed here. The models 
of large networks we deal with here therefore consist of systems of coupled nonlinear 
diffusion processes. 

2.1 Hodgkin-Huxley model 

One of the most important models in computational neuroscience is the Hodgkin- 
Huxley model. Using pioneering experimental techniques of that time, Hodgkin and 
Huxley [14] determined that the activity of the giant squid axon is controlled by 
three major currents: voltage-gated persistent K + current with four activation gates, 
voltage-gated transient Na + current with three activation gates and one inactivation 
gate, and Ohmic leak current, /l, which is carried mostly by chloride ions (CI - ). In 
this paper, we only use the space-clamped Hodgkin-Huxley model which we slightly 
generalize to a stochastic setting in order to better take into account the variability 
of the parameters. The advantages of this model are numerous, and one of the most 
prominent aspects in its favor is its correspondence with the most widely accepted 
formalism to describe the dynamics of the nerve cell membrane. A very extensive 
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literature can also be found about the mathematical properties of this system, and it 
is now quite well understood. 

The basic electrical relation between the membrane potential and the currents is 
simply: 

dV 

C-f = / eXt (0-/K-/Na-/L, 

at 

where I ext (t) is an external current. The detailed expressions for /k, ^Na and 7l can 
be found in several textbooks, e.g. [17, 20]: 

lK = gKn 4 (V - £ K ), 
^Na = gNztn 3 h(V - £ Na ), 
/l = Sl(V-£l), 

where #k (respectively, gNa) is the maximum conductance of the potassium (respec- 
tively, the sodium) channel; #l is the conductance of the Ohmic channel; and n (re- 
spectively, m) is the activation variable for K + (respectively, for Na). There are four 
(respectively, three) activation gates for the K + (respectively, the Na) current which 
accounts for the power 4 (respectively, 3) in the expression of /k (respectively ^a)- 
h is the inactivation variable for Na. These activation/deactivation variables, denoted 
by x G {n, m, h] in what follows, represent a proportion (they vary between 0 and 
1) of open gates. The proportions of open channels are given by the functions n 4 
and m 3 h. The proportions of open gates can be computed through a Markov chain 
modeling assuming the gates to open with rate p x (V) (the dependence in V accounts 
for the voltage-gating of the gate) and to close with rate £ X (V). These processes can 
be shown to converge, under standard assumptions, towards the following ordinary 
differential equations: 

x = p x (V)(l -x) - t; x (V)x, x G {n,m,h}. 

The functions p x (V) and £ X (V) are smooth functions whose exact values can be 
found in several textbooks such as the ones cited above. Note that half of these six 
functions are unbounded when the voltage goes to — oo, being of the form k\e~ klV , 
with k\ and k>i as two positive constants. Since these functions have been fitted to ex- 
perimental data corresponding to values of the membrane potential between roughly 
— 100 and 100 mVs, it is clear that extremely large in magnitude and negative val- 
ues of this variable do not have any physiological meaning. We can therefore safely, 
smoothly perturb these functions so that they are upper-bounded by some large (but 
finite) positive number for these values of the membrane potential. Hence, the func- 
tions p x and t; x are bounded and Lipschitz continuous for x e {n, m, h}. A more pre- 
cise model taking into account the finite number of channels through the Langevin 
approximation results in the stochastic differential equation 21 

dx t = (p*(V)(l - x) - SAV)x)dt + VP* (V) (1 - x) + & (V)x X (x) dWf , 



a More precisely, as shown in [79, 80], the convergence is to a larger - 13-dimensional - system with an 
invariant four-dimensional manifold on which the solution lives given appropriate initial conditions. See 
also [81]. 
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Evolution of a single Hodgkin-Huxley neuron without noise Evolution of a single Hodgkin-Huxley neuron without noise 




Time Time 



Fig. 1 Solution of the noiseless Hodgkin-Huxley model. Left: time evolution of the three ion channel 
variables n, m and h. Right: corresponding time evolution of the membrane potential. Parameters are 
given in the text. 



where Wf and x e {n,m,h} are independent standard Brownian motions. is 
a function that vanishes outside (0, 1). This guarantees that the solution remains a 
proportion, i.e. lies between 0 and 1 for all times. We define 

o x (V , x) = jp x (V ) (1 - x) + & (V)x X (x) . (1) 

In order to complete our stochastic Hodgkin-Huxley neuron model, we assume 
that the external current 7 ext (0 is the sum of a deterministic part, noted as I(t), and 
a stochastic part, a white noise with variance a ext built from a standard Brownian 
motion W t independent of Wf and x e {n,m,h}. Considering the current produced 
by the income of ion through these channels, we end up with the following system of 
stochastic differential equations: 

C dV t = (l(t)-g K n 4 (V - E K )-g m m 3 h(V - E m ) - g L (V - E L ))dt 

+ cr ext dW t , (2) 
dx t = (px(V)(l-x)-Zx(V)x)dt + cr x (V,x)dW?, x e {n,m,h}. 

This is a stochastic version of the Hodgkin-Huxley model. The functions p x and £ x 
are bounded and Lipschitz continuous (see discussion above). The functions n, m 
and h are bounded between 0 and 1; hence, the functions n 4 and m 3 h are Lipschitz 
continuous. 

To illustrate the model, we show in Figure 1 the time evolution of the three ion 
channel variables n, m and h as well as that of the membrane potential V for a 
constant input / = 20.0. The system of ordinary differential equations (ODEs) has 
been solved using a Runge-Kutta scheme of order 4 with an integration time step 
At = 0.01. In Figure 2, we show the same time evolution when noise is added to the 
channel variables and the membrane potential. 

For the membrane potential, we have used a ex t = 3.0 (see Equation 2), while for 
the noise in the ion channels, we have used the following x function (see Equation 1): 

X M = \re-w-e*-rt if0<x<h (3) 



0 ifx<0vjc>l 
with T = 0.1 and A = 0.5 for all the ion channels. The system of SDEs has been 
integrated using the Euler-Maruyama scheme with At = 0.01. 
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Evolution of a single Hodgkin-Huxley neuron with noise Evolution of a single Hodgkin-Huxley neuron with noise 




Time Time 



Fig. 2 Noisy Hodgkin-Huxley model. Left: time evolution of the three ion channel variables n, m and h. 
Right: corresponding time evolution of the membrane potential. Parameters are given in the text. 

Because the Hodgkin-Huxley model is rather complicated and high-dimensional, 
many reductions have been proposed, in particular to two dimensions instead of four. 
These reduced models include the famous FitzHugh-Nagumo and Morris-Lecar mod- 
els. These two models are two-dimensional approximations of the original Hodgkin- 
Huxley model based on quantitative observations of the time scale of the dynamics of 
each variable and identification of variables. Most reduced models still comply with 
the Lipschitz and linear growth conditions ensuring the existence and uniqueness of 
a solution, except for the FitzHugh-Nagumo model which we now introduce. 



2.2 The FitzHugh-Nagumo model 



In order to reduce the dimension of the Hodgkin-Huxley model, FitzHugh [15, 16, 21] 
introduced a simplified two-dimensional model. The motivation was to isolate con- 
ceptually essential mathematical features yielding excitation and transmission prop- 
erties from the analysis of the biophysics of sodium and potassium flows. Nagumo 
and collaborators [22] followed up with an electrical system reproducing the dynam- 
ics of this model and studied its properties. The model consists of two equations, one 
governing a voltage-like variable V having a cubic nonlinearity and a slower recovery 
variable w. It can be written as: 



V: 

w ■ 



(4) 



:f(V) ~ W + P Xt 

■ c(V + a — bw), 

where f(V) is a cubic polynomial in V which we choose, without loss of generality, 
to be f(V) = V — V 3 /3. The parameter 7 ext models the input current the neuron 
receives; the parameters a, b > 0 and c > 0 describe the kinetics of the recovery 
variable w. As in the case of the Hodgkin-Huxley model, the current 7 ext is assumed 
to be the sum of a deterministic part, noted /, and a stochastic white noise accounting 
for the randomness of the environment. The stochastic FitzHugh-Nagumo equation 
is deduced from Equation 4 and reads: 

' 3 \ 

)dt + cr ext dW t 



dV t = {V t - 
dw t = c(V t 



V* 



3 
- a - 



-w t + I 
bw t ) dt. 



(5) 
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Time Time 



Fig. 3 Time evolution of the membrane potential and the adaptation variable in the FitzHugh-Nagumo 
model. Left: without noise. Right: with noise. See text. 

Note that because the function f(V) is not globally Lipschitz continuous (only lo- 
cally), the well-posedness of the stochastic differential equation (Equation 5) does 
not follow immediately from the standard theorem which assumes the global Lips- 
chitz continuity of the drift and diffusion coefficients. This question is settled below 
by Proposition 1 . 

We show in Figure 3 the time evolution of the adaptation variable and the mem- 
brane potential in the case where the input / is constant and equal to 0.7. The left- 
hand side of the figure shows the case with no noise while the right-hand side shows 
the case where noise of intensity a ex t = 0-25 (see Equation 5) has been added. 

The deterministic model has been solved with a Runge-Kutta method of order 4, 
while the stochastic model, with the Euler-Maruyama scheme. In both cases, we have 
used an integration time step At = 0.01. 

2.3 Partial conclusion 

We have reviewed two main models of space-clamped single neurons: the Hodgkin- 
Huxley and FitzHugh-Nagumo models. These models are stochastic, including var- 
ious sources of noise: external and internal. The noise sources are supposed to be 
independent Brownian processes. We have shown that the resulting stochastic differ- 
ential Equations 2 and 5 were well-posed. As pointed out above, this analysis extends 
to a large number of reduced versions of the Hodgkin-Huxley such as those that can 
be found in the book [17]. 

2.4 Models of synapses and maximum conductances 

We now study the situation in which several of these neurons are connected to one 
another forming a network, which we will assume to be fully connected. Let N be 
the total number of neurons. These neurons belong to P populations, e.g. pyramidal 
cells or interneurons. If the index of a neuron is i, 1 < i < N, we note p(i) = a, 
1 < oi < P as the population it belongs to. We note N p (i) as the number of neurons 
in population p(i). Since we want to be as close to biology as possible while keeping 
the possibility of a mathematical analysis of the resulting model, we consider two 
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types of simplified, but realistic, synapses: chemical and electrical or gap junctions. 
The following material concerning synapses is standard and can be found in text- 
books [20]. The new, and we think important, twist is to add noise to our models. To 
unify notations, in what follows, i is the index of a postsynaptic neuron belonging to 
population a = p(i), and j is the index of a presynaptic neuron to neuron i belonging 
to population y = p(j). 

2. 4. 1 Chemical synapses 

The principle of functioning of chemical synapses is based on the release of a neu- 
rotransmitter in the presynaptic neuron synaptic button, which binds to specific re- 
ceptors on the postsynaptic cell. This process, similar to the currents described in the 
Hodgkin and Huxley model, is governed by the value of the cell membrane potential. 
We use the model described in [20, 23], which features a quite realistic biophysical 
representation of the processes at work in the spike transmission and is consistent 
with the previous formalism used to describe the conductances of other ion channels. 
The model emulates the fact that following the arrival of an action potential at the 
presynaptic terminal, a neurotransmitter is released in the synaptic cleft and binds 
to the postsynaptic receptor with a first order kinetic scheme. Let j be a presynaptic 
neuron to the postynaptic neuron i . The synaptic current induced by the synapse from 
j to i can be modelled as the product of a conductance gij with a voltage difference: 

I% a = - 8tj (.t)(V i -V&). (6) 

The synaptic reversal potentials are approximately constant within each popula- 
tion: Vrev : = Vrev • The conductance gij is the product of the maximum conductance 
Jij(t) with a function y* (t) that denotes the fraction of open channels and depends 
only upon the presynaptic neuron j : 

gij(t) = Jij(t)y j (t). (7) 

The function y^ it) is often modelled [20] as satisfying the following ordinary differ- 
ential equation: 

yj(t) = 4Sj(V j )(l-y j (t))-a J d yj(t). 

The positive constants a} and a J d characterize the rise and decay rates, respectively, of 
the synaptic conductance. Their values depend only on the population of the presy- 
naptic neuron j, i.e. a} := a y r and a J d := a d , but may vary significantly from one 
population to the next. For example, gamma- aminobutyric acid (GABA) B synapses 
are slow to activate and slow to turn off while the reverse is true for GABAa and 
AMPA synapses [20]. Sj(V^) denotes the concentration of the transmitter released 
into the synaptic cleft by a presynaptic spike. We assume that the function Sj is sig- 
moidal and that its exact form depends only upon the population of the neuron j . Its 
expression is given by (see, e.g. [20]): 

Sy(VJ) = . y ■ (8) 
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Destexhe et al. [23] give some typical values of the parameters r max = 1 mM, W = 
2mVand l/A = 5mV. 

Because of the dynamics of ion channels and of their finite number, similar to the 
channel noise models derived through the Langevin approximation in the Hodgkin- 
Huxley model (Equation 2), we assume that the proportion of active channels is actu- 
ally governed by a stochastic differential equation with diffusion coefficient cr y (V, y) 
depending only on the population y of j of the form (Equation 1): 

dyj = (a^S y (vj)(l-yj(t))-a y d yj(t))dt + ay(vj,yj)dW t Ly . 
In detail, we have 

tf(vj,yj)=Ja?S y (Vj){l-yj)+a r d yj X {y j ). (9) 

Remember that the form of the diffusion term guarantees that the solutions to this 
equation with appropriate initial conditions stay between 0 and 1 . The Brownian mo- 
tions W^ y are assumed to be independent from one neuron to the next. 

2.4.2 Electrical synapses 

The electrical synapse transmission is rapid and stereotyped and is mainly used to 
send simple depolarizing signals for systems requiring the fastest possible response. 
At the location of an electrical synapse, the separation between two neurons is very 
small (~3.5 nm). This narrow gap is bridged by the gap junction channels, special- 
ized protein structures that conduct the flow of ionic current from the presynaptic to 
the postsynaptic cell (see, e.g. [24]). 

Electrical synapses thus work by allowing ionic current to flow passively through 
the gap junction pores from one neuron to another. The usual source of this current 
is the potential difference generated locally by the action potential. Without the need 
for receptors to recognize chemical messengers, signaling at electrical synapses is 
more rapid than that which occurs across chemical synapses, the predominant kind 
of junctions between neurons. The relative speed of electrical synapses also allows 
for many neurons to fire synchronously. 

We model the current for this type of synapse as 

I$ B = J ij (t)(V i -Vl), (10) 
where Jij (t) is the maximum conductance. 

2.4.3 The maximum conductances 

As shown in Equations 6, 7 and 10, we model the current going through the synapse 
connecting neuron j to neuron i as being proportional to the maximum conductance 
Jij. Because the synaptic transmission through a synapse is affected by the nature 
of the environment, the maximum conductances are affected by dynamical random 
variations (we do not take into account such phenomena as plasticity). What kind of 
models can we consider for these random variations? 

The simplest idea is to assume that the maximum conductances are independent 

diffusion processes with mean and standard deviation -^p 1 , i.e. that depend only 
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on the populations. The quantities J ay , being conductances, are positive. We write 
the following equation: 

j (0 = ^Jl + ^.y (0> (11) 

7 Ny Ny S 

where the t; l,y (t), i = l,...,N,y = l,...,P, are NP -independent zero mean unit 
variance white noise processes derived from NP -independent standard Brownian 

motions B^(t), i.e. ^ (t) = which we also assume to be independent of 

all the previously defined Brownian motions. The main advantage of this dynamics 
is its simplicity. Its main disadvantage is that if we increase the noise level a ay , the 
probability that Jjj (t) becomes negative increases also: this would result in a negative 
conductance! 

One way to alleviate this problem is to modify the dynamics (Equation 11) to 
a slightly more complicated one whose solutions do not change sign, such as for 
instance, the Cox-Ingersoll-Ross model [25] given by: 

dJij(t) = 9 a y(^- - Jij(f)jdt + ^JT^dB^it). (12) 

Note that the right-hand side only depends upon the population y = p(j). Let Jij(0) 
be the initial condition, it is known [25] that 

E[7y(0] = Jij(0)e- e ^ + ^(1 - e- e «r% 
Var(/ l7 (0) = W^f(e-^' - e~^') + ^^(l - e~^f. 

iy/ y a ay ZiV y u ay 

This shows that if the initial condition Jij (0) is equal to the mean ^ , the mean of 
the process is constant over time and equal to ^ . Otherwise, if the initial condition 
Jij (0) is of the same sign as J ay , i.e. positive, then the long term mean is ^ and 
the process is guaranteed not to touch 0 if the condition 2NyO ay J a y > (o^) 2 holds 
[25]. Note that the long term variance is a J Ar3 ^ y . 

2.5 Putting everything together 

We are ready to write the equations of a network of Hodgkin-Huxley or FitzHugh- 
Nagumo neurons and study their properties and their limit, if any, when the number 
of neurons becomes large. The external current for neuron i has been modelled as the 
sum of a deterministic part and a stochastic part: 

/e*t (0 = / . (f) + ffia _l. 

We will assume that the deterministic part is the same for all neurons in the same 
population, l[ := I a , and that the same is true for the variance, cr^t := °exr ^ e further 
assume that the N Brownian motions W l t are TV -independent Brownian motions and 
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independent of all the other Brownian motions defined in the model. In other words, 



ir( t )=i a (t)+oi 



a = p(i), i = 1, 



, N . 



(13) 



ext dt 

We only cover the case of chemical synapses and leave it to the reader to derive the 
equations in the simpler case of gap junctions. 

2.5.7 Network of FitzHugh-Nagumo neurons 

We assume that the parameters a\ , b[ and c\ in Equation 5 of the adaptation variable 
w l of neuron i are only functions of the population a = p(i). 

Simple maximum conductance variation. If we assume that the maximum con- 
ductances fluctuate according to Equation 11, the state of the /th neuron in a fully 
connected network of FitzHugh-Nagumo neurons with chemical synapses is deter- 
mined by the variables (V 1 ,w l ,y l ) that satisfy the following set of 3N stochastic 
differential equations: 



dV! = 




V%)yj)dt 



E <{y}-^U)dB\« (14) 

+ <tdw;, 

dw\ = c a (V} +a a — b a w l t )dt, 

dy\ = tfSaivlXl-yfi-aSyfidt+ayivt^dwi". 

Sa(V t l ) is given by Equation 8; 0% , by Equation 9; and W l t ,y , i = 1, . . . , N, are 
independent Brownian processes that model noise in the process of transmitter release 
into the synaptic clefts. 

Sign-preserving maximum conductance variation. If we assume that the maximum 
conductances fluctuate according to Equation 12, the situation is slightly more com- 
plicated. In effect, the state space of the neuron i has to be augmented by the P 
maximum conductances y = 1, . . . , P. We obtain 



dV! 



v!- 



(V/) 3 



-w\+I a (t))dt 



~(E^ E W)(v}-vS!)yi) 

V = l 7 J,P(J)=Y I 



dt 



dw\ = c a (V t l + a a — b a w l t ) dt, 

dy\ = {a a r S a {vl){\-y\)-aly\)dt^al{v\,y\)d^\ 



(15) 



dJi Y (t) 



u ay 

jv7 



Jiy(t)\dt + 



< r 

N v y 



Jiyi^dB^it), Y = l,...,P, 



iy / i«y 

which is a set of N(P + 3) stochastic differential equations. 
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2.5.2 Network of Hodgkin-Huxley neurons 



We provide a similar description in the case of the Hodgkin-Huxley neurons. We 
assume that the functions p l x and x e {n,m,h}, that appear in Equation 2 only 
depend upon a = p(i). 

Simple maximum conductance variation. If we assume that the maximum con- 
ductances fluctuate according to Equation 11, the state of the ith neuron in a fully 
connected network of Hodgkin-Huxley neurons with chemical synapses is therefore 
determined by the variables (V 1 ,n l ,m l , h l , y l ) that satisfy the following set of 5N 
stochastic differential equations: 

CdV* = (I a (t) - gknt(vt - E K ) - g^m]hi{V l t - £ Na ) - g L (v/ - E L ))dt 

j,P(j)=Y 





E <{V}-V2!)AdBi" (16) 

s j,P(j)=Y 
+ < t dW' t , 

dxi(t) = (p%(V')(l-x i )-{ x (V')x i )dt + a x (V',x i )dW?' 1 , xe{n,m,h), 
. dy\ = (a»S a {v!){\ - y]) - a a d y\) dt + o£(v/, y' t ) dW*' y . 

Sign-preserving maximum conductance variation. If we assume that the maximum 
conductances fluctuate according to Equation 12, we use the same idea as in the 
FitzHugh-Nagumo case of augmenting the state space of each individual neuron and 
obtain the following set of (5 + P)N stochastic differential equations: 

CdVl = (I a (t) - gknf(vl - E K ) - gmm]hi(V l t - £ Na ) - - E L )) dt 

V = l 7 j,pU)=Y ' 
+ <tdW}, (17) 
dxi{t) = (p a x (V l t ){\-Xi) - &(V t l )xi)dt + a x (V t l ,Xi)dW?' 1 , x e {n,m,h}, 

dy\ = «Mv/)(i - yi) -"5yl)dty>{vi,yi)dwi- y , 

dJ iy {t) = 0 ay - Jiyit^j dt + ^Jj iy ( t )dB^(t), y = 1, . . . , P. 



2.5.3 Partial conclusion 



Equations 14 to 17 have a quite similar structure. They are well-posed, i.e. given any 
initial condition, and any time T > 0, they have a unique solution on [0, T] which 
is square-integrable. A little bit of care has to be taken when choosing these initial 
conditions for some of the parameters, i.e. n, m and h, which take values between 0 
and 1, and the maximum conductances when one wants to preserve their signs. 

In order to prepare the grounds for the 'Mean-field equations for conductance- 
based models' section, we explore a bit more the aforementioned common struc- 
ture. Let us first consider the case of the simple maximum conductance variations 



Springer 



Page 14 of 50 



Baladron et al. 



for the FitzHugh-Nagumo network. Looking at Equation 14, we define the three- 
dimensional state vector of neuron i to be X\ = (V/, w\, y]). Let us now define 
f a : K x R 3 -> R 3 , a = 1, . . . , P, by 



LtfS a {V}){l-yl)-a5ylJ 
R 3x2 by 



Let us next define g a :ixR- 

u ext 

fo(f,X{)= 0 0 

L 0 o->(V/,y')- 

It appears that the intrinsic dynamics of the neuron i is conveniently described by the 
equation 



dX\ = f a (t,X\)dt + g a (t,Xi) 



dWl 



We next define the functions b ay : R 3 x R 3 -> R 3 , for a, y = 1, . . . , P, by 

Jay \ *t v rev )yt 



bay(X l t , X/) - 

and the function 0 ay : R 3 x R 3 -> R 3xl by 



0 
0 



Pay (X\, Z/) — 



-<(v/- W' 



0 
0 



It appears that the full dynamics of the neuron i, corresponding to Equation 14, can 
be described compactly by 

p 



dXl = f a (t,X\)dt + g a (t,Xl) 



dW l t 
ldW l t > y ] 



ITT E bay(XlX{)dt 



K = l j,P(j)=Y 



1 



(18) 



+ E Ky(xl,Xi)dBy. 

K = l 7 j,p(J)=Y 

Let us now move to the case of the sign-preserving variation of the maximum con- 
ductances, still for the FitzHugh-Nagumo neurons. The state of each neuron is now 
P+ 3 -dimensional: we define X\ = (V/, w\, y\, Jn(t), . . . , Jip(t)). We then define 
the functions / a :Rx R p+3 -> R p+3 , a = 1, . . . , P, by 



■'ay 



^5«(v/)(i-y/)"^ 

^ory 



/ iy (f) ,y = l,...,i> 
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and the functions g a :Rx R p+3 -> r(^+3)x(p+2) by 



8a{t,X\) = 



^ext 
0 

0 
0 



0 



0 

0 
0 

0 



0 
0 
0 



0 
0 
0 



0 



u aP 

Np 



y/JiP(t) 



It appears that the intrinsic dynamics of the neuron i isolated from the other neurons 
is conveniently described by the equation 



dX\ = f a (t,X\)dt + g a (t,X i t ) 



" dWj 

dW l i 
dB\ 



i,y 
t 

U 



LdB' t ' p A 



Let us finally define the functions b ay : R p+3 x R p+3 -> R p+3 , for a, y = 1, 
by 

-Jij(t)(vj-V£!)yi^ 



P, 



bay(X l t , Xj) — 



0 



0 



It appears that the full dynamics of the neuron i, corresponding to Equation 15 can 
be described compactly by 

" dWj n 

_ dW l t ' y 

dX i t =f a (t,X i t )dt + g a (t,X i t ) ' 



dB, 



LdB l t 



i,P 



(19) 



y=\ 



j,P(j)=Y 



We let the reader apply the same machinery to the network of Hodgkin-Huxley 
neurons. 

Let us note d as the positive integer equal to the dimension of the state space in 
Equation 18 (d = 3) or 19 id — 3 + P) or in the corresponding cases for the Hodgkin- 
Huxley model ( d = 5 and d = 5 + P). The reader will easily check that the following 
four assumptions hold for both models: 

(HI) Locally Lipschitz dynamics: For all a e {1, . . . , P}, the functions f a and g a are 
uniformly locally Lipschitz continuous with respect to the second variable. In 
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detail, for all U > 0, there exists Kjj > 0 independent of t e [0, T] such that 
for all x, y e B*, the ball of R d of radius U: 

Wfad, x) - f a (t, y)\\ + \\g a (t, x) - g a (t, y)\\ <Ku\\x-y\\, a = 1, . . . , P. 

(H2) Locally Lipschitz interactions: For all a, y e {1, . . . , P], the functions b ay and 
Pay are locally Lipschitz continuous. In detail, for all U > 0, there exists Ljj > 
0 such that for all x, y, x f , y' e By, we have: 

\\b ay (x, y) - b ay (x f , /) I + \\p ay (x, y) - p ay (x', y f ) || 
SLudx-x'W + Wy-y'l). 

(H3) Linear growth of the interactions: There exists a K > 0 such that 

max(||^ K (x,z)|| 2 , ||^ K (x,z)|| 2 )<^(l + ||x|| 2 ). 

(H4) Monotone growth of the dynamics: We assume that f a and g a satisfy the fol- 
lowing monotonous condition for all a = 1 , . . . , P : 

x T f a (t,x) + ^\\g a (t,x)f < K(l + ||x|| 2 ). (20) 

These assumptions are central to the proofs of Theorems 2 and 4. 

They imply the following proposition stating that the system of stochastic differ- 
ential equations (Equation 19) is well-posed: 

Proposition 1 Let T > 0 be a fixed time. If\I a (t) \ < I m on [0, T],for a = 1, . . . , P, 
Equations 18 and 19 together with an initial condition X l 0 e L 2 (R^), i = 1, . . . , N of 
square -integrable random variables, have a unique strong solution which belongs to 
L 2 ([0,T];R dN ). 

Proof The proof uses Theorem 3.5 in chapter 2 in [26] whose conditions are easily 
shown to follow from hypotheses 2.5.3 to (H2). □ 

The case N = I implies that Equations 2 and 5, describing the stochastic 
FitzHugh-Nagumo and Hodgkin-Huxley neurons, are well-posed. 

We are interested in the behavior of the solutions of these equations as the number 
of neurons tends to infinity. This problem has been long-standing in neuroscience, 
arousing the interest of many researchers in different domains. We discuss the differ- 
ent approaches developed in the field in the next subsection. 

2.6 Mean-field methods in computational neuroscience: a quick overview 

Obtaining the equations of evolution of the effective mean-field from microscopic dy- 
namics is a very complex problem. Many approximate solutions have been provided, 
mostly based on the statistical physics literature. 

Many models describing the emergent behavior arising from the interaction of 
neurons in large-scale networks have relied on continuum limits ever since the semi- 
nal work of Amari, and Wilson and Cowan [27-30]. Such models represent the activ- 
ity of the network by macroscopic variables, e.g. the population-averaged firing rate, 
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which are generally assumed to be deterministic. When the spatial dimension is not 
taken into account in the equations, they are referred to as neural masses, otherwise as 
neural fields. The equations that relate these variables are ordinary differential equa- 
tions for neural masses and integrodifferential equations for neural fields. In the sec- 
ond case, they fall in a category studied in [31] or can be seen as ordinary differential 
equations defined on specific functional spaces [32]. Many analytical and numerical 
results have been derived from these equations and related to cortical phenomena, for 
instance, for the problem of spatio-temporal pattern formation in spatially extended 
models (see, e.g. [33-36]). The use of bifurcation theory has also proven to be quite 
powerful [37, 38]. Despite all its qualities, this approach implicitly makes the as- 
sumption that the effect of noise vanishes at the mesoscopic and macroscopic scales 
and hence that the behavior of such populations of neurons is deterministic. 

A different approach has been to study regimes where the activity is uncorrelated. 
A number of computational studies on the integrate-and-fire neuron showed that un- 
der certain conditions, neurons in large assemblies end up firing asynchronously, 
producing null correlations [39-41]. In these regimes, the correlations in the firing 
activity decrease towards zero in the limit where the number of neurons tends to in- 
finity. The emergent global activity of the population in this limit is deterministic 
and evolves according to a mean-field firing rate equation. However, according to the 
theory, these states only exist in the limit where the number of neurons is infinite, 
thereby raising the question of how the finiteness of the number of neurons impacts 
the existence and behavior of asynchronous states. The study of finite-size effects for 
asynchronous states is generally not reduced to the study of mean firing rates and 
can include higher order moments of firing activity [42-44] . In order to go beyond 
asynchronous states and take into account the stochastic nature of the firing and un- 
derstand how this activity scales as the network size increases, different approaches 
have been developed, such as the population density method and related approaches 
[45]. Most of these approaches involve expansions in terms of the moments of the 
corresponding random variables, and the moment hierarchy needs to be truncated 
which is not a simple task that can raise a number of difficult technical issues (see, 
e.g. [46]). 

However, increasingly many researchers now believe that the different intrinsic or 
extrinsic noise sources are part of the neuronal signal, and rather than being a pure 
disturbing effect related to the intrinsically noisy biological substrate of the neural 
system, they suggest that noise conveys information that can be an important principle 
of brain function [47]. At the level of a single cell, various studies have shown that 
the firing statistics are highly stochastic with probability distributions close to the 
Poisson distributions [48], and several computational studies confirmed the stochastic 
nature of single-cell firings [49-51]. How the variability at the single-neuron level 
affects the dynamics of cortical networks is less well established. Theoretically, the 
interaction of a large number of neurons that fire stochastic spike trains can naturally 
produce correlations in the firing activity of the population. For instance, power laws 
in the scaling of avalanche- size distributions has been studied both via models and 
experiments [52-55]. In these regimes, the randomness plays a central role. 

In order to study the effect of the stochastic nature of the firing in large networks, 
many authors strived to introduce randomness in a tractable form. Some of the models 
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proposed in the area are based on the definition of a Markov chain governing the fir- 
ing dynamics of the neurons in the network, where the transition probability satisfies 
a differential equation, the master equation. Seminal works of the application of such 
modeling for neuroscience date back to the early 1990s and have been recently de- 
veloped by several authors [43, 56-59]. Most of these approaches are proved correct 
in some parameter regions using statistical physics tools such as path integrals and 
Van-Kampen expansions, and their analysis often involve a moment expansion and 
truncation. Using a different approach, a static mean-field study of multi-population 
network activity was developed by Treves in [60] . This author did not consider exter- 
nal inputs but incorporated dynamical synaptic currents and adaptation effects. His 
analysis was completed in [39], where the authors proved, using a Fokker-Planck 
formalism, the stability of an asynchronous state in the network. Later on, Gerst- 
ner in [61] built a new approach to characterize the mean-field dynamics for the spike 
response model, via the introduction of suitable kernels propagating the collective ac- 
tivity of a neural population in time. Another approach is based on the use of large de- 
viation techniques to study large networks of neurons [62] . This approach is inspired 
by the work on spin-glass dynamics, e.g. [63]. It takes into account the randomness 
of the maximum conductances and the noise at various levels. The individual neuron 
models are rate models, hence already mean-field models. The mean-field equations 
are not rigorously derived from the network equations in the limit of an infinite num- 
ber of neurons, but they are shown to have a unique, non-Markov solution, i.e. with 
infinite memory, for each initial condition. 

Brunei and Hakim considered a network of integrate-and-fire neurons connected 
with constant maximum conductances [41]. In the case of sparse connectivity, sta- 
tionarity, and in a regime where individual neurons emit spikes at a low rate, they 
were able to analytically study the dynamics of the network and to show that it ex- 
hibits a sharp transition between a stationary regime and a regime of fast collective 
oscillations weakly synchronized. Their approach was based on a perturbative analy- 
sis of the Fokker-Planck equation. A similar formalism was used in [44] which, when 
complemented with self-consistency equations, resulted in the dynamical description 
of the mean-field equations of the network and was extended to a multi population 
network. Finally, Chizhov and Graham [64] have recently proposed a new method 
based on a population density approach allowing to characterize the mesoscopic be- 
havior of neuron populations in conductance-based models. 

Let us finish this very short and incomplete survey by mentioning the work of 
Sompolinsky and colleagues. Assuming a linear intrinsic dynamics for the individual 
neurons described by a rate model and random centered maximum conductances for 
the connections, they showed [65, 66] that the system undergoes a phase transition 
between two different stationary regimes: a 'trivial' regime where the system has 
a unique null and uncorrected solution, and a 'chaotic' regime in which the firing 
rate converges towards a non-zero value and correlations stabilize on a specific curve 
which they were able to characterize. 

All these approaches have in common that they are not based on the most widely 
accepted microscopic dynamics (such as the ones represented by the Hodgkin-Huxley 
equations or some of their simplifications) and/or involve approximations or moment 
closures. Our approach is distinct in that it aims at deriving rigorously and without 
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approximations the mean-field equations of populations of neurons whose individual 
neurons are described by biological, if not correct at least plausible, representations. 
The price to pay is the complexity of the resulting mean-field equations. The spe- 
cific study of their solutions is therefore a crucial step, which will be developed in 
forthcoming papers. 



3 Mean-field equations for conductance-based models 

In this section, we give a general formulation of the neural network models introduced 
in the previous section and use it in a probabilistic framework to address the problem 
of the asymptotic behavior of the networks, as the number of neurons N goes to 
infinity. In other words, we derive the limit in law of N -interacting neurons, each of 
which satisfying a nonlinear stochastic differential equation of the type described in 
the 'Spiking conductance-based models' section. In the remainder of this section, we 
work in a complete probability space (Q, J 7 , P) satisfying the usual conditions and 
endow with a filtration (J-'t)t- 

3.1 Setting of the problem 

We recall that the neurons in the network fall into different populations P. The pop- 
ulations differ through the intrinsic properties of their neurons and the input they 
receive. We assume that the number of neurons in each population a e {1, . . . , P}, 
denoted by N a , increases as the network size increases and moreover that the asymp- 
totic proportion of neurons in population a is nontrivial, i.e. N a /N — > k a e (0, 1) as 
N goes to infinity 13 . 

We use the notations introduced in the 'Partial conclusion' section, and the reader 
should refer to this section to give a concrete meaning to the rather abstract (but 
required by the mathematics) setting that we now establish. 

Each neuron i in population a is described by a state vector noted as X l t ,N in R d 
and has an intrinsic dynamics governed by a drift function / a :RxR^^ and a 
diffusion matrix g a : R x R d \-> M. dxm assumed uniformly locally Lipschitz continu- 
ous with respect to the second variable. For a neuron i in population a, the dynamics 
of the d -dimensional process (X\) governing the evolution of the membrane potential 
and additional variables (adaptation, ionic concentrations), when there is no interac- 
tion, is governed by the equation: 

dx\- N = f a (t, r t ' N ) dt + ga (t, xi' N ) dw>;. 

Moreover, we assume, as it is the case for all the models described in the 'Spiking 
conductance-based models' section, that the solutions of this stochastic differential 
equation exist for all time. 



b As we will see in the proof, most properties are valid as soon as N a tends to infinity as N goes to infinity 
for all a e {1, . . . , P}, the previous assumption will allow quantifying the speed of convergence towards 
the asymptotic regime. 
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When included in the network, these processes interact with those of all the other 
neurons through a set of continuous functions that only depend on the population 
a = p(i), the neuron i belongs to and the populations y of the presynaptic neurons. 
These functions, b ay (x, y) : R d x R d i-> R d , are scaled by the coefficients 1/N y , 
so the maximal interaction is independent of the size of the network (in particular, 
neither diverging nor vanishing as N goes to infinity). 

As discussed in the 'Spiking conductance-based models' section, due to the 
stochastic nature of ionic currents and the noise effects linked with the discrete nature 
of charge carriers, the maximum conductances are perturbed dynamically through the 
N x P -independent Brownian motions B l t ,a of dimension 8 that were previously in- 
troduced. The interaction between the neurons and the noise term is represented by 
the function p ay : R d x R d \-> R dxS . 

In order to introduce the stochastic current and stochastic maximum conductances, 
we define two independent sequences of independent m- and 8 -dimensional Brown- 
ian motions noted as (W})i e ® and (^ a )i G N,ae{i-P} which are adapted to the filtra- 
tion Tt • 

The resulting equation for the /th neuron, including the noisy interactions, reads: 
dX\> N = f a (t,X i t ' N )dt + Y j ±- J2 b aY {X\- N ,xi' N )dt 

y=l y^p{j)=y (21) 

+ ga (t,X i t < N )dW i 1 +Y j ±- £ P a y(V N ,xi< N )dB7. 

y=l 7 j,p(j)=Y 

Note that this implies that X l > N and X^ N have the same law whenever p(i) = p(j), 
given identically distributed initial conditions. 

These equations are similar to the equations studied in another context by a num- 
ber of mathematicians, among which are McKean, Tanaka and Sznitman (see the 
'Introduction' section), in that they involve a very large number of particles (here, 
particles are neurons) in interaction. Motivated by the study of the McKean- Vlasov 
equations, these authors studied special cases of equations (Equation 21). This theory, 
referred to as the kinetic theory, is chiefly interested in the study of the thermodynam- 
ics questions. They show the property that in the limit where the number of particles 
tends to infinity, provided that the initial state of each particle is drawn independently 
from the same law, each particle behaves independently and has the same law, which 
is given by an implicit stochastic equation. They also evaluate the fluctuations around 
this limit under diverse conditions [1, 2, 6, 7, 9-11]. Some extensions to biological 
problems where the drift term is not globally Lipschitz but satisfies the monotone 
growth condition (Equation 20) were studied in [67] . This is the approach we under- 
take here. 



3.2 Convergence of the network equations to the mean-field equations and 
properties of those equations 

We now show that the same type of phenomena that were predicted for systems of 
interacting particles happen in networks of neurons. In detail, we prove that in the 
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limit of large populations, the network displays the property of propagation of chaos. 
This means that any finite number of diffusion processes become independent, and all 
neurons belonging to a given population a have asymptotically the same probability 
distribution, which is the solution of the following mean-field equation: 

p 

dXf = f a {t, Xf)dt + J2®z[ b *y(X?> Zt)]dt 

7=1 p (22) 

+ ga (t, x°) dw? + 5>z[M*?' %)] dB ? y > <* = h--.,p, 

y=l 

where Z is a process independent of X that has the same law, and denotes the ex- 
pectation under the law of Z. In other words, the mean-field equation can be written, 
denoting by dm] (z) the law of Z Y t (hence, also of X y t ): 

p 



dXf = f a (t,Xf)dt + Y i (f b ay (Xf , z) dm] (z)) dt 

p 

+ g a (t, Xf) dW? + J2 (j Rd P"Y ' Z ) dm < dB t Y ' 



y=l 

In these equations, W", for a = 1 • • • P, are independent, standard, m -dimensional 
Brownian motions. Let us point out the fact that the right-hand side of Equations 22 
and 23 depends on the law of the solution; this fact is sometimes referred to as 'the 
process X is attracted by its own law'. This equation is also classically written as 
the McKean-Vlasov-Fokker-Planck equation on the probability distribution p of the 
solution. This equation which we use in the 'Numerical simulations' section can be 
easily derived from Equation 22. Let p a (t,z), z = (zi, . . . , Zd), be the probability 
density at time t of the solution Xf to Equation 22 (this is equivalent to dmf (z) = 
p a (t, z) dz), then we have: 



d t p a (t,z) = - div z ^/ a (f,z) + bay(z,y)p y (t,y)dy S jp a (t,z) ^ 

where the d x d matrix D a is given by 
p 

D a ( Z ) = J2 E z[Pa Y (z, Z)]El[p aY (z, Z)] + g a (t , z)gl \t , z) 
y=\ 

with 



(24) 



E z [Pa Y (z,Z)] = J /3 ay (z,y)p } 



(t,y)dy. 



The P equations (Equation 24) yield the probability densities of the solutions Xf 
of the mean-field equations (Equation 22). Because of the propagation of chaos re- 
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suit, the X? are statistically independent, but their probability functions are clearly 
functionally dependent. 

We now spend some time on notations in order to obtain a somewhat more 
compact form of Equation 22. We define X t to be the d P -dimensional process 
X t = (Xf; a = 1 • • • P). We similarly define f,g,b and ft as the concatenations of 
functions f a , g a , b a ^ and fi a ,y, respectively. In details, f(t, X) = (f a (t, Xf); a = 
1 • • • P), b(X, Y) = (J2y=i b a y(X a , Y Y )\ a = 1 • • • P) and W = (W a ; a = 1 • ■ ■ P). 
The term of noisy synaptic interactions requires a more careful treatment. We define 
£ = (/3 a y ; a, y = 1 • • • P) e (R dx8 ) PxP and B = (fl 0 ^; a, y = 1 • • • P) e (R 8 ) PxP , 
and the product O of an element M e ($l dx8 ) PxP and an element X e (R s ) PxP as 
the element of (R d ) p : 

(MQX) a = J2 M ayX ay . 

y 

We obtain the equivalent compact mean-field equation: 
dX t = (f(t, X t )+Ez[b(X t , Z t )])dt 

(25) 

+ g(t,X t )dW t +Ez[P(X t ,Z t )] QdB t . 

Equations 22 and 24 are implicit equations on the law of X t . 

We now state the main theoretical results of the paper as two theorems. The first 
theorem is about the well-posedness of the mean-field equation (Equation 22). The 
second is about the convergence of the solutions of the network equations to those 
of the mean-field equations. Since the proof of the second theorem involves similar 
ideas to those used in the proof of the first, it is given in the Appendix. 

Theorem 2 Under assumptions (HI) to (H4), there exists a unique solution to the 
mean- field equation (Equation 22) on [0, T] for any T > 0. 

Let us denote by M(C) the set of probability distributions on C the set continuous 
functions [0, T] h> (R d ) p , and M 2 (C) the space of square-integrable processes. Let 
(W a ; a = 1 • • • P) (respectively, (B ay ; a, y = 1 • • • P)) also be a family of P (respec- 
tively, /^-independent, m (respectively 5)-dimensional, adapted standard Brownian 
motions on (Q, J 7 , P). Let us also note Xo e M(R d ) p as the (random) initial con- 
dition of the mean-field equation. We introduce the map O acting on stochastic pro- 
cesses and defined by: 

(M(C)^M(C), 
Xh> (Y t = {Y^a = \-'-P)) t with 

Y? = n + [ (fa(s, X?) +pE z [b aY (XZ, Z?)] j ds 

+ f ga {s,X«)dWf 
Jo 

p r t 

+ T E z [Pay(X?,Zl)]dBf y , a = l,...,P. 

y=l J ° 
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We have introduced in the previous formula the process Z t with the same law as and 
independent of X t . There is a trivial identification between the solutions of the mean- 
field equation (Equation 22) and the fixed points of the map O : any fixed point of O 
provides a solution for Equation 22, and conversely, any solution of Equation 22 is a 
fixed point of O. 

The following lemma is useful to prove the theorem: 

Lemma 3 Let Xo e ~h 2 ((M, d ) p ) be a square-integrable random variable. Let X be 
a solution of the mean-field equation {Equation 22) with initial condition X$. Un- 
der assumptions (H3) and (H4), there exists a constant C(T) > 0 depending on the 
parameters of the system and on the horizon T, such that: 

E[\\X t f]<C(T), Vf e[0, T]. 
Proof Using the ltd formula for \\X t || 2 , we have: 

ll^ll 2 = IIXoll 2 + 2 jT (xjf(s, X s ) + l - \\g(s, X s )f 

+ xjE z [b(X s , Zs)] + \ \\E Z [P(X S9 Z s )] || 2 ) ds + N t , 

where N t is a stochastic integral, hence with a null expectation, E[N t ] = 0. 

This expression involves the term x T b(x,z). Because of assumption (H3), we 
clearly have: 

\x T b(x, z)\ < \\x\\ \\b(x, z) I < 11x11^(1 + ||x|| 2 ) < yfk{\ + ||x|| 2 ). 

It also involves the term x T f(t, x) + \ x) || 2 which, because of assumption (H4), 
is upperbounded by ^(1 + || jc || 2 ). Finally, assumption (H3) again allows us to upper- 
bound the term ^\\E z [fi(X s , Z s )]\\ 2 by f (1 + \\X S \\ 2 ). 
Finally, we obtain 

E[l + \\X t \\ 2 ] < E[l + UXoll 2 ] +2^ + | + JI^J jf E[l + ||X,|| 2 ]^. 

Using Gronwall's inequality, we deduce the L 2 boundedness of the solutions of the 
mean-field equations. □ 

This lemma puts us in a position to prove the existence and uniqueness theorem: 

Proof We start by showing the existence of solutions and then prove the uniqueness 
property. We recall that by the application of Lemma 3, the solutions will all have 
bounded second-order moment. 

Existence. Let X° = (X® = {X® a , a = 1 • • • P}) e M(C) be a given stochastic pro- 
cess, and define the sequence of probability distributions (X k )k>o on M(C) defined 
by induction by X k+l = <&(X k ). Define also a sequence of processes Z k , k > 0, in- 
dependent of the sequence of processes X k and having the same law. We note this as 
'X and Z i.i.d.' below. We stop the processes at the time the first hitting time of 
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the norm of X k to the constant value U . For convenience, we will make an abuse of 
notation in the proof and denote X k = X k ^ k . This implies that X k belongs to B^, 

the ball of radius U centered at the origin in R d , for all times t e [0, T]. 

Using the notations introduced for Equation 25, we decompose the difference 
X k+1 - X k as follows: 



Jo 



A t 



+ f\ z [b(xlz k s )-b(X k -\z^)]ds 
Jo 



+ [\g(s,X*)-g(s,X k - l ))dW 5 
Jo 



c t 



+ [ f E z [p(X k s , Z k s ) - p(X k -\ Z k s - l )]odB s 
Jo 



and find an upperbound for M k := E[sup 5< ^ — X k || 2 ] by finding upperbounds 

for the corresponding norms of the four terms A t , B t , C t and D r . Applying the dis- 
crete Cauchy-Schwartz inequality, we have: 

H^+l _ x k f < 4(\\A t \\ 2 + \\B t \\ 2 + UGH 2 + || All 2 ) 

and treat each term separately. The upperbounds for the first two terms are ob- 
tained using the Cauchy-Schwartz inequality, those of the last two terms using the 
Burkholder-Davis-Gundy martingale moment inequality. 

The term A t is easily controlled using the Cauchy-Schwarz inequality and the use 
of assumption (HI): 

HAj 2 <^ 2 r ( S \\x k u -x k u - l fdu. 

Jo 

Taking the sup of both sides of the last inequality, we obtain 

sup||AJ 2 <^ 2 r f \X k -X k - l \ 2 ds<KlT f sup\\X k - X^fds, 

s<t Jo Jo u<s 

from which follows the fact that 

E[sup||^|| 2 l < K 2 V T f E\ Sa p\\X k u -X k u - l \\ 2 ]ds. 

L s<t J JO L u<s J 

The term B t is controlled using the Cauchy-Schwartz inequality, assumption (H2), 
and the fact that the processes X and Z are independent with the same law: 

|| B s || 2 < 2TL 2 V f - X k ~ l f + E[\\X k u - X k ~ l || 2 ]) du. 
Jo 
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Taking the sup of both sides of the last inequality, we obtain 

pi 

sup||B,|| 2 <2r4 / (supl^-X^-'f + ErsuplX^-X^ 1 ! 2 !)^, 

s<t JO x u<s L u<s J/ 



from which follows the fact that 



E[sup II B, || 2 1 < 4TLy f E\sup\\X k u - X k u 

L s<t J JO L u<s 



1 II 2" 



ds. 



The term C t is controlled using the fact that it is a martingale and applying the 
Burkholder-Davis-Gundy martingale moment inequality and assumption (HI): 



E sup ||C, 

\-s<t 



<4K 



I rEfsuplxJ-Z*- 1 ! 2 " 

«/0 L u<s 



ds. 



The term D t is also controlled using the fact that it is a martingale and applying the 
Burkholder-Davis-Gundy martingale moment inequality and assumption (H2): 



E 



nt 

sup||D,|| 2 l<164 / Efsup||xi - Xj- 1 1| 2 1 rf*. 

■s<t J Jo L u<s J 



Putting all of these together, we get: 
Er S up|X* +1 -Xj| 2 l 

*-s<t J 



: 4(T + 4)(K 2 V + 4L 2 ) f e[su P || X k u - X k ~ l \\ 2 ] ds. 

JO L u<s - 1 



(26) 



From the relation M k < K" M k ~ l ds with K" = 4(T + 4)(K* + 4L* ), we get 
by an immediate recursion: 



M k < 



(K'f / .../ M® ds\--- dsk 
Jo Jo Jo 



{K") k t k 

k\ 



(27) 



■My 



and Mj is finite because the processes are bounded. The Bienayme-Tchebychev in- 
equality and Equation 27 now give 



F(sup||^ +1 -^|| 2 > 

\s<t 



1 



22(^+1) 



(4K"t) k n 
<4- -M% 



and this upper bound is the term of a convergent series. The Borel-Cantelli lemma 
stems that for almost any co e £2, there exists a positive integer ko(co) (co denotes an 
element of the probability space £2) such that 

1 



and hence 



sup||^ +1 -^|| 2 < 



sup||^ +1 -X k \\ < 



22(k+l) ' 



Vk > k 0 (co) 



2 k+\ 



Vk>k 0 (co). 
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It follows that with probability I, the partial sums: 

n 

x^j2( x t +1 ~ x t)= x t 

k=0 

are uniformly (in t G [0, T]) convergent. Denote the thus defined limit by X t . It is 
clearly continuous and ^-adapted. On the other hand, the inequality (Equation 27) 
shows that for every fixed t, the sequence {X"} n >\ is a Cauchy sequence in L 2 . 
Lemma 3 shows that X g M 2 (C). 

It is easy to show using routine methods that X indeed satisfies Equation 22. 

To complete the proof, we use a standard truncation property. This method re- 
places the function / by the truncated function: 



f u xf 11*11 
fu{t ' X) -\f{t,Ux/\\x\\), \\x\\>U, 



and similarly for g. The functions fjj and gu are globally Lipchitz continuous; hence, 
the previous proof shows that there exists a unique solution Xjj to equations (Equa- 
tion 22) associated with the truncated functions. This solution satisfies the equation 

Xuit) = X 0 + [\fu(t, Xu(s)) + Ez[b(Xu(s), Z s )]) ds 
Jo 

(28) 

+ f g u (t,X u (s))dW s + [ Ez[p(Xu(s),Z s )]odB s , t g [0, T], 
Jo Jo 

Let us now define the stopping time as 

r u= M{t g[0, T], \\Xu(t)\\ >U}. 

It is easy to show that 

X v (t) = X V f(t) if 0 < t < tu, U' > U, (29) 

implying that the sequence of stopping times xjj is increasing. Using Lemma 3 which 
implies that the solution to Equation 22 is almost surely bounded, for almost all co e 
£"2, there exists Uo(co) such that xjj = T for all U > Uq. Now, define X(t) = Xu 0 (t), 
t G [0, T]. Because of Equation 29, we have X(t A xjj) = Xjj(t A tjj), and it follows 
from Equation 28 that 

X{t^ru) = X Q + / (fu{s,X s )+Ez[b{X s ,Z s )\)ds+ / gu (s,X s )dW s 
Jo Jo 

+ / Ez[p(Xu(s),Z s )]odB s 
Jo 

rtAXu _ _ _ CtAXu 

= X 0 + (f(s,X s )+Ez[b(X s ,Z s )])ds+ / g(s,X s )dW s 

Jo Jo 

ntAxu 

+ / Ez[p(Xu(s),Z s )]odB s , 
Jo 

and letting U —> oo, we have shown the existence of solution to Equation 22 which, 
by Lemma 3, is square-integrable. 
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Uniqueness. Assume that X and Y are two solutions of the mean-field equations 
(Equation 22). From Lemma 3, we know that both solutions are in M 2 (C). Moreover, 
using the bound Equation 26, we directly obtain the inequality: 



We have proved the well-posedness of the mean-field equations. It remains to show 
that the solutions to the network equations converge to the solutions of the mean-field 
equations. This is what is achieved in the next theorem. 

Theorem 4 Under assumptions (HI) to (H4), the following holds true: 

• Convergence 0 : For each neuron i of population a, the law of the multidimensional 
process X l,N converges towards the law of the solution of the mean-field equation 
related to population a, namely X a . 

• Propagation of chaos: For any k e N*, and any k-tuple (i\, . . . , the law of the 
process (X l t l,N , ... , X l t n,N , t < T) converges towards^ raf ^ 0 • • • 0 m^ ln \ i.e. 
the asymptotic processes have the law of the solution of the mean-field equations 
and are all independent. 

This theorem has important implications in neuroscience that we discuss in the 
'Discussion and conclusion' section. Its proof is given in the Appendix. 

4 Numerical simulations 

At this point, we have provided a compact description of the activity of the network 
when the number of neurons tends to infinity. However, the structure of the solutions 
of these equations is complicated to understand from the implicit mean-field equa- 
tions (Equation 22) and of their variants (such as the McKean-Vlasov-Fokker-Planck 
equations (Equation 24)). In this section, we present some classical ways to numer- 
ically approximate the solutions to these equations and give some indications about 
the rate of convergence and the accuracy of the simulation. These numerical schemes 
allow us to compute and visualize the solutions. We then compare the results of the 
two schemes for a network of FitzHugh-Nagumo neurons belonging to a single pop- 
ulation and show their good agreement. 

The main difficulty one faces when developing numerical schemes for Equa- 
tions 22 and 24 is that they are non-local. By this, we mean that in the case of the 



c The type of convergence is specified in the proof given in the Appendix. 
d The notation mf was introduced right after Equation 22. 




which, by Gronwall's theorem, directly implies that 




s<t 



which ends the proof. 



□ 
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McKean-Vlasov equations, they contain the expectation of a certain function under 
the law of the solution to the equations (see Equation 22). In the case of the cor- 
responding Fokker-Planck equation, it contains integrals of the probability density 
functions which is a solution to the equation (see Equation 24). 

4.1 Numerical simulations of the McKean-Vlasov equations 

The fact that the McKean-Vlasov equations involve an expectation of a certain func- 
tion under the law of the solution of the equation makes them particularly hard to 
simulate directly. One is often reduced to use Monte Carlo simulations to compute 
this expectation, which amounts to simulating the solution of the network equations 
themselves (see [68]). This is the method we used. In its simplest fashion, it consists 
of a Monte Carlo simulation where one numerically solves the N network equations 
(Equation 21) with the classical Euler-Maruyama method a number of times with dif- 
ferent initial conditions, and averages the trajectories of the solutions over the number 
of simulations. 

In detail, let At > 0 and ]VgN*. The discrete-time dynamics implemented in 
the stochastic numerical simulations consists of simulating M times a P -population 
discrete-time process (X l n ,n < T/At,i = 1 • • • N), solution of the recursion, for i in 
population a : 

*£l = ^ + A ' { fa K") dt + J2 ^ E b *Y > *n r ) ] 

I y = l y j=l,p(j)=y J 

+ E M^n-Cil 

y=l y j=l,pU)=Y J 

where and ^ K ' r are independent d- and 8 -dimensional standard normal random 
variables. The initial conditions X l ^ r , i = 1, . . . , N, are drawn independently from 
the same law within each population for each Monte Carlo simulation r = 1, . . . , M. 
One then chooses one neuron i a in each population a = 1 , . . . , P . If the size N of 
the population is large enough, Theorem 4 states that the law, noted as p a (t, X), 
of X la should be close to that of the solution X a of the mean-field equations for 
a = 1 , . . . , P . Hence, in effect, simulating the network is a good approximation (see 
below) of the simulation of the mean-field or McKean-Vlasov equations [68, 69]. 
An approximation of p a (t, X) can be obtained from the Monte Carlo simulations by 
quantizing the phase space and incrementing the count of each bin whenever the tra- 
jectory of the i a neuron at time t falls into that particular bin. The resulting histogram 
can then be compared to the solution of the McKean-Vlasov-Fokker-Planck equation 
(Equation 24) corresponding to population a whose numerical solution is described 
next. 

The mean square error between the solution of the numerical recursion (Equa- 
tion 30) X l n and the solution of the mean-field equations (Equation 22) X l nAt is of 
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order 0(\/~Kt + 1/a/AT), the first term being related to the error made by approxi- 
mating the solution of the network of size N, X l ^ t by an Euler-Maruyama method, 
and the second term, to the convergence of X l ^ t towards the mean-field equation 
X l nAt when considering globally Lipschitz continuous dynamics (see proof of The- 
orem 4 in the Appendix). In our case, as shown before, the dynamics is only locally 
Lipschitz continuous. Finding efficient and provably convergent numerical schemes 
to approximate the solutions of such stochastic differential equations is an area of 
active research. There exist proofs that some schemes are divergent [70] or conver- 
gent [71] for some types of drift and diffusion coefficients. Since our equations are 
not included in either case, we conjecture convergence since we did not observe any 
divergence and leave the proof for future work. 

4.2 Numerical simulations of the McKean-Vlasov-Fokker-Planck equation 

For solving the McKean-Vlasov-Fokker-Planck equation (Equation 24), we have 
used the method of lines [72, 73]. Its basic idea is to discretize the phase space and to 
keep the time continuous. In this way, the values p a (t, X), a = 1, . . . , P of the prob- 
ability density function of population a at each sample point X of the phase space 
are the solutions of P ODEs where the independent variable is the time. Each sample 
point in the phase space generates P ODEs, resulting in a system of coupled ODEs. 
The solutions to this system yield the values of the probability density functions p a 
solution of (Equation 24) at the sample points. The computation of the integral terms 
that appear in the McKean-Vlasov-Fokker-Planck equation is achieved through a re- 
cursive scheme, the Newton-Cotes method of order 6 [74]. The dimensionality of the 
space being large and numerical errors increasing with the dimensionality of the inte- 
grand, such precise integration schemes are necessary. For an arbitrary real function 
/ to be integrated between the values x\ and X2, this numerical scheme reads: 

/ f(x) dx^—Ax T[l9f(x l + (5i - 5) Ax) + 75/ (xi + (5i - 4) Ax) 
J X1 288 . =i 

+ 50/(xi + (5i - 3)Ax) + 50/ (xi + (5i - 2) Ax) 

+ 75/(jci + (5/ - 1)Ajc) + 19/(jci + 5/ Ax)], 

where Ax is the integration step, and M = (x2 — xi)/Ax is chosen to be an integer 
multiple of 5. 

The discretization of the derivatives with respect to the phase space parameters is 
done through the following fourth-order central difference scheme: 

df(x) ^ f(x - 2Ax) - 8/(x - Ax) + 8/(x + Ax) - fix + 2Ax) 
dx 12 Ax 

for the first-order derivatives, and 

d 2 fjx) 
dx 2 

for the second-order derivatives (see [75]). 



(-fix - 2Ax) + 16 fix - Ax) 

- 30 fix) + 16/(x + Ax) - fix + 2Ax))/(l2Ax 2 ) 
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Finally, we have used a Runge-Kutta method of order 2 (RK2) for the numerical 
integration of the resulting system of ODEs. This method is of the explicit kind for 
ordinary differential equations, and it is described by the following Butcher tableau: 



0 




2/3 


2/3 




1/4 3/4 



4.3 Comparison between the solutions to the network and the mean-field equations 

We illustrate these ideas with the example of a network of 100 FitzHugh-Nagumo 
neurons belonging to one, excitatory, population. We also use chemical synapses with 
the variation of the weights described by (Equation 11). We choose a finite volume, 
outside of which we assume that the probability density function (p.d.f.) is zero. We 
then discretize this volume with nyn w ny points defined by 

def 

n V = (Vmax- Vmin)/AV, 
n w = Omax - WminVAw, 

n y = f (y max - y mi n)/Ay, 

where V m in, ^max, Wmin, w max , Vmin and y max define the volume in which we solve 
the network equations and estimate the histogram defined in the 'Numerical simu- 
lations of the McKean-Vlasov equations' section, while AV, Aw and Ay are the 
quantization steps in each dimension of the phase space. For the simulation of the 
McKean-Vlasov-Fokker-Planck equation, instead, we use Dirichlet boundary condi- 
tions and assume the probability and its partial derivatives to be 0 on the boundary 
and outside the volume. 

In general, the total number of coupled ODEs that we have to solve for the 
McKean-Vlasov-Fokker-Planck equation with the method of lines is the product 
Pnyn w n y (in our case, we chose P = 1). This can become fairly large if we increase 
the precision of the phase space discretization. Moreover, increasing the precision 
of the simulation in the phase space, in order to ensure the numerical stability of 
the method of lines, requires to decrease the time step At used in the RK2 scheme. 
This can strongly impact the efficiency of the numerical method (see the 'Numerical 
simulations with GPUs' section). 

In the simulations shown in the left-hand parts of Figures 4 and 5, we have used 
one population of 100 excitatory FitzHugh-Nagumo neurons connected with chem- 
ical synapses. We performed 10,000 Monte Carlo simulations of the network equa- 
tions (Equation 14) with the Euler-Maruyama method in order to approximate the 
probability density. The model for the time variation of the synaptic weights is the 
simple model. The p.d.f. p(0, V, w, y) of the initial condition is Gaussian and reads 

p(0, V,w,y) 

— (31) 

= \ e HV-Vo) 2 /(2a^)-(w-w 0 ) 2 /(2al 0 )-(y-y 0 ) 2 /(2^ 0 ) 

(27T) 3 / 2 a Vo a wo a yo 
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Fokker-Planck Equation 




Fokker-Planck Equation 




Fokker-Planck Equation 




Fig. 4 Joint probability distribution. (V, w) computed with the Monte Carlo algorithm for the network 
equations (Equation 14) (left) compared with the solution of the McKean-Vlasov-Fokker-Planck equation 
(Equation 24) (right), sampled at four times % n . Parameters are given in Table 1, with a current / = 0.4 
corresponding to a stable limit cycle. Initial conditions (first column of Table 1) are concentrated inside 
this limit cycle. The two distributions are similar and centered around the limit cycle with two peaks (see 
text). 
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Fokker-Planck Equation 




Fokker-Planck Equation 




Fokker-Planck Equation 




Fig. 5 Joint probability distribution. (V, y) computed with the Monte Carlo algorithm for the network 
equations (Equation 14) (left) compared with the solution of the McKean-Vlasov-Fokker-Planck equation 
(Equation 24) (right), sampled at four times t^ n . Parameters are given in Table 1, with a current / = 0.4 
corresponding to a stable limit cycle. Initial conditions (first column of Table 1) are concentrated inside 
this limit cycle. The two distributions are similar and centered around the limit cycle with two peaks (see 
text). 
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Table 1 Parameters used in the simulations of the neural network and for solving the McKean-Vlasov- 
Fokker-Planck equation 



Initial condition 


Phase space 


FitzHugh- 
Nagumo 


Synaptic 
weights 


Synapse 


ffin = [0.5, 1.2, 1.5,2.2], 


^min = — ^ 


a = 0.7 


7=1 


Vrev = 1 


At = 0.01 (mean field), 


Vmax — 3 


6 = 0.8 


07=0.2 


a r = 1 


0.1 (network) 


av = 0.1 


c = 0.08 




= 1 


V 0 = 0.0 


^min = -2 


7 = 0.4 




7m ax — 1 


ory 0 =0.4 


w max — 2 


^ext = 0 




^ = 0.2 


wq = 0.5 


Aw = 0.1 






V T =2 


0wq = 0-4 


Jmin = 0 






r = 0.1 


y 0 = o.3 


ymax — 1 






A = 0.5 


cr yo = 0.05 


Ay = 0.06 









Results are shown in Figures 4 and 5 (see text). 



The parameters are given in the first column of Table 1. In this table, the pa- 
rameter ffin is the time at which we stop the computation of the trajectories in the 
case of the network equations and the computation of the solution of the McKean- 
Vlasov-Fokker-Planck equation in the case of the mean-field equations. The sequence 
[0.5, 1.2, 1.5, 2.2] indicates that we compute the solutions at those four time instants 
corresponding to the four rows of Figures 4 and 5. The phase space has been quan- 
tized with the parameters shown in the second column of the same table to solve 
the McKean-Vlasov-Fokker-Planck equation. This quantization has also been used 
to build the histograms that represent the marginal probability densities with respect 
to the pairs (V,w) and (V,y) of coordinates of the state vector of a particular neu- 
ron. These histograms have then been interpolated to build the surfaces shown in the 
left-hand side of Figures 4 and 5. The parameters of the FitzHugh-Nagumo model 
are the same for each neuron of the population: they are shown in the third column of 
Table 1. 

The parameters for the noisy model of maximum conductances of Equation 1 1 are 
shown in the fourth column of the table. For these values of / and 07 , the probability 
that the maximum conductances change sign is very small. Finally, the parameters 
of the chemical synapses are shown in the sixth column. The parameters V and A 
are those of the x function (Equation 3). The solutions are computed over an interval 
of r nn = 0.5, 1.2, 1.5, 2.2 time units with a time sampling of At = 0.1 for the net- 
work and At = 0.01 for the McKean-Vlasov-Fokker-Planck equation. The rest of the 
parameters are the typical values for the FitzHugh-Nagumo equations. 

The marginals estimated from the trajectories of the network solutions are then 
compared to those obtained from the numerical solution of the McKean-Vlasov- 
Fokker-Planck equation (see Figures 4 and 5 right), using the method of lines ex- 
plained above and starting from the same initial conditions (Equation 31) as the neu- 
ral network. 

We have used the value / = 0.4 for the external current (this value corresponds to 
the existence of a stable limit cycle for the isolated FitzHugh-Nagumo neuron), and 
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Limit Cycle in the V-w plane Limit Cycle in the V-y plane 




Adaptation w 



Fig. 6 Projection of 100 trajectories in the (V, w) (top left), (V, y) (top right) and (w, y) (bottom) planes. 
The limit cycle is especially visible in the (V, w) projection (red curves). The initial conditions split the 
trajectories into two classes corresponding to the two peaks shown in Figures 4 and 5. The parameters are 
the same as those used to generate these two pictures. 



the initial conditions have the values Vo = 0, Wo = 0.5 and y 0 = 0.3; therefore, the 
initial points of the trajectories in the phase space are concentrated inside the limit 
cycle. We therefore expect that the solutions of the neural network and the McKean- 
Vlasov-Fokker-Planck equation will concentrate their mass around the limit cycle. 
This is what is observed in Figures 4 and 5, where the simulation of the neural net- 
work (left-hand side) is in very good agreement with the results of the simulation of 
the McKean-Vlasov-Fokker-Planck equation (right-hand side). Note that the densi- 
ties display two peaks. These two peaks correspond to the fact that depending upon 
the position of the initial condition with respect to the nullclines of the FitzHugh- 
Nagumo equations, the points in the phase space follow two different classes of tra- 
jectories, as shown in Figure 6. The two peaks then rotate along the limit cycle in the 
(V, w) space (see also the 'Numerical simulations with GPUs' section). 

Figures 4 and 5 show a qualitative similarity between the marginal probabil- 
ity density functions obtained by simulating the network and those obtained by 
solving the Fokker-Planck equation corresponding to the mean-field equations. 
To make this more quantitative, we computed the Kullback-Leibler divergence 
^KL^Networkl |/?mvfp) between the two distributions. 

We performed 10,000 Monte Carlo simulations of the network equations up to 
r fin = 10 for increasing values of the network size N. As shown in Figure 7, the 
Kullback-Leibler divergence does decrease with increasing values of N, thereby con- 
firming the fact that even for relatively small values of N, the average behavior of 
the network is well represented by the mean-field system described by the McKean- 
Vlasov-Fokker-Planck equation. 
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Fig. 7 Variation of the 
Kullback-Leibler divergence. 
Variation of the 
Kullback-Leibler divergence 
between the marginal 
probability density function 
p(t, V, w) estimated from the 
network equations and 
computed from the 
McKean-Vlasov-Fokker-Planck 
equation as a function of the 
network size. We have 
performed 10,000 Monte Carlo 
simulations of the network 
equations up to time ^ n = 10.0 



10° 10 1 10 2 10 3 10 4 

Number of neurons in the network 

4.4 Numerical simulations with GPUs 

Unfortunately, the algorithm for solving the McKean-Vlasov-Fokker-Planck equation 
described in the previous section is computationally very expensive. In fact, when 
the number of points in the discretized grid of the (V, w, y) phase space is big, i.e. 
when the discretization steps AV, Aw; and Ay are small, we also need to keep At 
small enough in order to guarantee the stability of the algorithm. This implies that the 
number of equations that must be solved has to be large and moreover that they must 
be solved with a small time step if we want to keep the numerical errors small. This 
will inevitably slow down the simulations. We have dealt with this problem by using 
a more powerful hardware, the graphical processing units (GPUs). 

We have changed the Runge-Kutta scheme of order 2 used for the simulations 
shown in the 'Numerical simulations of the McKean-Vlasov-Fokker-Planck equa- 
tion' section and adopted a more accurate Runge-Kutta scheme of order 4. This was 
done because with the more powerful machine, each computation of the right-hand 
side of the equation is faster, making it possible to use four calls per time step instead 
of two in the previous method. Hence, the parallel hardware allowed us to use a more 
accurate method. 

One of the purposes of the numerical study is to get a feeling for how the different 
parameters, in particular those related to the sources of noise, influence the solutions 
of the McKean-Vlasov-Fokker-Planck equation. This is meant to prepare the ground 
for the study of the bifurcation of these solutions with respect to these parameters, 
as was done in [76] in a different context. For this preliminary study, we varied the 
input current / and the parameter a ex t controlling the intensity of the noise on the 
membrane potential in Equations 14. The McKean-Vlasov-Fokker-Planck equation 
writes in this case: e 



e We have included a small noise (controlled by the parameter a w ) on the adaptation variable w. This does 
not change the previous analysis, in particular proposition 1 , but makes the McKean-Vlasov-Fokker-Planck 
equation well-posed in a cube of the state space with 0 boundary value, see e.g. [82]. 




Springer 



Page 36 of 50 



Baladron et al. 



Table 2 Parameters used in the simulations of the McKean-Vlasov-Fokker-Planck equation on GPUs 



Initial condition 


Phase space 


Stochastic 
FN neuron 


Synaptic 
weights 


At = 0.0025, 0.0012 


V- = -4 
v mm ^ 


a = 0.7 


7 = l 


V 0 = 0.0 


Vmax = 4 


^ = 0.8 


aj =0.01 


a VQ = 0.2 


AV =0.027 


c = 0.08 




Wq = —0.5 


^min = 3 


/ = 0.4, 0.7 




cr WQ = 0.2 


w max — 3 


or ext = 0.27, 0.45 




y 0 = o.3 


Aw = 0.02 


a w = 0.0007 




cr yo = 0.05 


>'min = 0 
>'max = 1 
Ay = 0.003 







The simulations are shown in Figures 8 and 9 and in Additional files 1, 2, 3 and 4. 



dt 



p(t, V, w, y) 



d 

'ay 



v 3 - c 

v w + / - J(v - y rev ) / //?^y^,/Wy'</u/<// 

3 « 3 



x p(t, V, w, y) 
d 



d W 
d 



[c(V +a -bw)p(t, y, w,y)] 



- — {[a r S(V)(l - 30 - V, u;, y)} 



(32) 



+ 



i a 2 



2 ay 2 

x p{t, V, w, y) 



+ crj(V - y re v) 2 ( f y'p{t, V, uA/JrfV'du/rf/ 

VM 3 



2n 



i 2 a 2 



2a^ 



{[a r s(y)(i - y) + w]x 2 oo/^, y, u>, y)}. 



The simulations were run with the x function (Equation 3); the initial condition 
described by Equation 31 and the parameters are shown in Table 2. These parameters 
are similar to those used in the previous numerical simulations, but they differ in the 
size of the grid which is larger in this case. 

Four snapshots of the solution are shown in Figure 8 (corresponding to the values 
/ = 0.4 and a ext = 0.27 of the external input current and of the standard deviation of 
the noise on the membrane potential), and three are shown in Figure 9 (corresponding 
to the values / = 0.7 and <7 ext = 0.45). In the figures, the left column corresponds to 
the values of the marginal p(t, V, w), and the right column corresponds to the values 
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Fig. 8 Marginals of the solutions to the McKean-Vlasov-Fokker-Planck equation. Marginals with respect 
to the V and w variables (left) and to the V and y variables (right) of the solution of the McKean-Vlasov- 
Fokker-Planck equation. The^rsf row shows the initial condition; the second, the marginals at time 30.0; 
the third, the marginals at time 50.0; and the fourth, the stationary (large time) solutions. The input current 
/ is equal to 0.4 and <r ext = 0.27. These are screenshots at different times of movies available as Additional 
files 1 and 2. 
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Fig. 9 Marginals of the solutions to the McKean-Vlasov-Fokker-Planck equation. Marginals with respect 
to the V and w variables (left) and to the V and y variables (right) of the solution of the McKean-Vlasov- 
Fokker-Planck equation. The first row shows the marginals at time 30.0, the second the marginals at time 
50.0 and the third the stationary (large time) solutions. The input current / is equal to 0.7 and cx ext = 0.45. 
These are screenshots at different times of movies available as Additional files 3 and 4. 



of the marginal p(t, V, y). Both are necessary to get an idea of the shape of the full 
distribution p(t, V, w, y). The first row of Figure 8 shows the initial conditions. They 
are the same for the results shown in Figure 9. The second, third and fourth rows of 
Figure 8 show the time instants t = 30.0, t = 50.0 and at convergence (the time units 
differ from those of the previous section, but it is irrelevant to this discussion). The 
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Fokker-Planck Equation Fokker-Planck Equation 




Potential V 1 2 3 -2.0 15 Potential V 2 3 0.0 



Fig. 10 Marginals of the solutions to the McKean-Vlasov-Fokker-Planck equation at convergence. 
Marginals with respect to the V and w variables (left) and to the V and y variables (right) of the so- 
lution of the McKean-Vlasov-Fokker-Planck equation at convergence. The parameters are those in Table 1 
except for the input current / which is equal to —0.8, <x ext = 0.45 and % n = 2.2. Compare with the last 
row of Figure 9 (see text). 

three rows of Figure 9 show the time instants t = 30.0, t = 50.0 and at convergence. 
In both cases, the solution appears to converge to a stationary distribution whose mass 
is distributed over a 'blurred' version of the limit cycle of the isolated neuron. The 
'blurriness' increases with the variance of the noise. The four movies for these two 
cases are available as Additional files 1, 2, 3 and 4. 

The results shown in Figures 8 and 9 and in Additional files 1, 2, 3 and 4 
were obtained using two machines, each with seven nVidia Tesla C2050 cards, six 
2.66 GHz dual-Xeon X5650 processors and 72G of ram. The communication inside 
each machine was done using the lpthreads library and between machines using MPI 
calls. The mean execution time per time step using the parameters already described 
is 0.05 s. 

The reader interested in more details in the numerical implementations and in the 
gains that can be achieved by the use of GPUs can consult [77] . 

In Figure 10, we show a solution to the McKean-Vlasov-Fokker-Planck equation 
which is qualitatively quite different from the solutions shown in Figures 8 and 9: The 
stationary solution is concentrated at a point in (V, w, y) space. This is an indication 
that perhaps, between the values —0.8 and 0.4 of the input current, the solutions 
to the McKean-Vlasov-Fokker-Planck equation have bifurcated. The numerical tools 
we have developed may be a way to build an intuition to guide a rigorous analysis of 
these phenomena. 

5 Discussion and conclusion 

In this article, we addressed the problem of the limit in law of networks of biolog- 
ically inspired neurons as the number of neurons tends to infinity. We emphasized 
the necessity of dealing with biologically inspired models and discussed at length the 
type of models relevant to this study. We chose to address the case conductance-based 
network models that are a relevant description of the neuronal activity. Mathemati- 
cal results on the analysis of these diffusion processes in interaction resulted to the 
replacement of a set of NP d-dimensional coupled equations (the network equa- 
tions) in the limit of large TVs by P d -dimensional mean-field equations describing 
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the global behavior of the network. However, the price to pay for this reduction was 
the fact that the resulting mean-field equations are nonstandard stochastic differential 
equations, similar to the McKean-Vlasov equations. These can be expressed either as 
implicit equations on the law of the solution or, in terms of probability density func- 
tion through the McKean-Vlasov-Fokker-Planck equations, as a nonlinear, non-local 
partial differential equation. These equations are, in general, hard to study theoreti- 
cally. 

Besides the fact that we explicitly model real spiking neurons, the mathematical 
part of our work differs from that of previous authors such as McKean, Tanaka and 
Sznitman (see the 'Introduction' section) because we are considering several popula- 
tions with the effect that the analysis is significantly more complicated. Our hypothe- 
ses are also more general, e.g. the drift and diffusion functions are nontrivial and 
satisfy the general condition (H4) which is more general than the usual linear growth 
condition. Also, they are only assumed locally (and not globally) Lipschitz contin- 
uous to be able to deal, for example, with the FitzHugh-Nagumo model. A locally 
Lipschitz continuous case was recently addressed in a different context for a model 
of swarming in [67] . 

Proofs of our results, for somewhat stronger hypotheses than ours and in special 
cases, are scattered in the literature, as briefly reviewed in the 'Introduction' and 'Set- 
ting of the problem' sections. Our main contribution is that we provide a complete, 
self-sufficient proof in a fairly general case by gathering all the ingredients that are 
required for our neuroscience applications. In particular, the case of the FitzHugh- 
Nagumo model where the drift function does not satisfy the linear growth condition 
involves a generalization of previous works using the more general growth condi- 
tion (H4). 

The simulation of these equations can itself be very costly. We, hence, addressed 
in the 'Numerical simulations' section numerical methods to compute the solutions 
of these equations, in the probabilistic framework, using the convergence result of the 
network equations to the mean-field limit and standard integration methods of differ- 
ential equations or in the Fokker-Planck framework. The simulations performed for 
different values of the external input current parameter and one of the parameters 
controlling the noise allowed us to show that the spatio-temporal shape of the proba- 
bility density function describing the solution of the McKean- Vlasov-Fokker-Planck 
equation was sensitive to the variations of these parameters, as shown in Figures 8 
and 9. However, we did not address the full characterization of the dynamics of the 
solutions in the present article. This appears to be a complex question that will be 
the subject of future work. It is known that for different McKean-Vlasov equations, 
stationary solutions of these equations do not necessarily exist and, when they do, 
are not necessarily unique (see [78]). A very particular case of these equations was 
treated in [76] where the authors consider that the function f a is linear, g a is con- 
stant and b a p(x, y) = Sp(y). This model, known as the firing-rate model, is shown 
in that paper to have the Gaussian solutions when the initial data is Gaussian, and 
the dynamics of the solutions can be exactly reduced to a set of 2P -coupled ordinary 
differential equations governing the mean and the standard deviation of the solution. 
Under these assumptions, a complete study of the solutions is possible, and the de- 
pendence upon the parameters can be understood through bifurcation analysis. The 
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authors show that intrinsic noise levels govern the dynamics, creating or destroying 
fixed points and periodic orbits. 

The mean-field description has also deep theoretical implications in neuroscience. 
Indeed, it points towards the fact that neurons encode their responses to stimuli 
through probability distributions. This type of coding was evoked by several au- 
thors [47], and the mean-field approach shows that under some mild conditions, this 
phenomenon arises: all neurons belonging to a particular population can be seen as 
independent realizations of the same process, governed by the mean-field equation. 
The relevance of this phenomenon is reinforced by the fact that it has recently been 
observed experimentally that neurons had correlation levels significantly below what 
had been previously reported [13]. This independence has deep implications on the 
efficiency of neural coding which the propagation of chaos theory accounts for. To 
illustrate this phenomenon, we have performed the following simulations. Consider- 
ing a network of 2, 10 and 100 FitzHugh-Nagumo neurons, we have simulated 2,000 
times the network equations over some time interval [0, 100]. We have picked at ran- 
dom a pair of neurons and computed the time variation of the cross -correlation of 
the values of their state variables. The results are shown in Figure 1 1. It appears that 
the propagation of chaos is observable for relatively small values of the number of 
neurons in the network, thus indicating once more that the theory developed in this 
paper in the limit case of an infinite number of neurons is quite robust to finite- size 
effects/ 



FitzHugh-Nagumo Network, N = 2, Monte Carlo sims - 2000 FitzHugh-Nagumo Network, N = 10, Monte Carlo sims = 2000 




Time 



Fig. 11 Variations over time of the cross-correlation of (V, w, y) variables of several FitzHugh-Nagumo 
neurons in a network. Top left: 2 neurons. Top right: 10 neurons. Bottom: 100 neurons. The cross-correla- 
tion decreases steadily with the number of neurons in the network. 



Note that we did not estimate the correlation within larger networks since, as predicted by Theorem 4, it 
will be smaller and smaller, requiring an increasingly large number of Monte Carlo simulations. 
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The present study develops theoretical arguments to derive the mean-field equa- 
tions resulting from the activity of large neuron ensembles. However, the rigorous 
and formal approach developed here does not allow direct characterization of brain 
states. The paper, however, opens the way to rigorous analysis of the dynamics of 
large neuron ensembles through derivations of different quantities that may be rele- 
vant. A first approach could be to derive the equations of the successive moments of 
the solutions. Truncating this expansion would yield systems of ordinary differential 
equations that can give approximate information on the solution. However, the choice 
of the number of moments taken into account is still an open question that can raise 
several deep questions [46] . 



Appendix 1: Proof of Theorem 4 

In this appendix, we prove the convergence of the network equations towards the 
mean-field equations (Equation 22) and of the propagation of chaos property. The 
proof follows standard proofs in the domain as generally done, in particular by Tanaka 
or Sznitman [6, 10], adapted to our particular case where we consider a non-zero drift 
function and a time- and space-dependent diffusion function. It is based on the very 
powerful coupling argument, which identifies the almost sure limit of the process X 1 
as the number of neurons tends to infinity, as popularized by Sznitman in [12], but 
whose idea dates back from the 1970s (for instance, Dobrushin uses it in [5]). This 
process is exactly the solution of the mean-field equation driven by the same Brown- 
ian motion as X 1 and with the same initial condition random variable. In our case, this 
leads us to introduce the sequence of independent stochastic processes (X l t )i=\...N 
having the same law as X a , a = p(i), solution of the mean-field equation: 

P 

dX\ = f a (t,X i t )dt + J2 E z [ b <*Y (*t > Z t)] dt 

7=1 p (33) 

+ ga (t,Xi)dW} + J2M^y(X i t ,Zj)]dB i t , 
y=\ 

with initial condition X l 0 = X l Q , the initial condition of the neuron i in the network, 
which was assumed to be independent and identically distributed. (W/) and (Bj) are 
the Brownian motions involved in the network equation (Equation 21). As described 
previously, Z = (Z , . . . , Z p ) is a process independent of X that has the same law. 
Denoting, as described previously, the probability distribution of Xf solution of the 
mean-field equation (Equation 22) by mf , the law of the collection of processes (X l t k ) 
for some fixed k e N*, namely m p ^ (g) • • • (g) m p ^ lk \ is shown to be the limit of the 
process (X\) solution of the network equations (Equation 21) as N goes to infinity. 
We recall, for completeness, Theorem 4: 

Theorem 4 Under assumptions (HI) to (H4), the following holds true: 

• Convergence: For each neuron i of population a, the law of the multidimensional 
process X l,N converges towards the law of the solution of the mean-field equation 
related to population a, namely X a . 
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• Propagation of chaos: For any k e N*, and any k-uplet (zi , . . . , ^ 
process (X l t uN , . . . , Xj"'^, t <T) converges towards mf^ ® • • • ®mf^ ln \ i.e. the 
asymptotic processes have the law of the solution of the mean-field equations and 
are all independent. 



Proof On our way, we also prove that 



Sfsup|xj- 



max AnElsupllXl'^-X' 112 

j=l-JV 



< oo, (34) 



which implies, in particular, convergence in law of the process {X\' N , t <T) towards 
(X", t <T) solution of the mean-field equations (Equation 22). 

The proof basically consists of thoroughly analyzing the difference between the 
two processes as N tends to infinity. The difference is the sum of eight terms (we 
dropped the index TV for the sake of simplicity of notations) denoted by A, through 

X\-X\ = f f a (s, Xl) - f a (s, X' s )ds + f g a (s, X' s ) - g a (s, x' s )dW' s 
Jo Jo 

' , ' 1 , ' 

A, B, 
P f 1 Ny 

+ E/ w J2 b «y( x s' x ')- b «y(^ x ') ds 

~, JO N Y ~, 



y=i" u ' i=\ 



c, 



P f 1 Ny 

+ E/ j r J2 b »y( x s> xJ *)- b «y( x >> xJ *) ds 



Y=l" v Y 7 = 1 



p r< i Ny 

+ E/ i^J2 b «y( x s' X ')-M b -y( x ^ z s)]ds (35) 



E, 



p r< i Ny 

E/ F EAy(^ui)-^K^ s J )^ 



y=l'» r j=l 



p f 1 Ny 

+ E / w Y.^yi x 's' x ^)-^y{K^)dB7 

~, JO Ny f-f 



G, 



E y ^EM*U0- E z[M^ z 0]^ 
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It is important to note that the probability distribution of these terms does not depend 
on the neuron i . We are interested in the limit, as N goes to infinity, of the quantity 
E[sup 5<r || X l s — X l s \\ 2 ]. We decompose this expression into the sum of the eight 
terms involved in Equation 35 using Holder's inequality and upperbound each term 
separately. The terms A t and B t are treated exactly as in the proof of Theorem 2. 
We start by assuming that / and g are uniformly globally K Lipschitz continuous 
with respect to the second variable. The locally Lipschitz case is treated in the same 
manner as done in the proof of Theorem 2 (1) by stopping the process at time tjj, 
(2) by using the Lipschitz continuity of / and g in the ball of radius U and (3) 
by a truncation argument and using the almost sure boundedness of the solutions 
extending the convergence to the locally Lipschitz case. 
As seen previously, we have: 

Ersup||AJ 2 l<^ 2 r f E\su V \\xi-Xi\\ 2 ]ds, 
Er S up||B,|| 2 l <4K 2 fE\saplxl- Xlf]ds. 

Now, for C t , 

p r i Ny 

y=i Jo N * j=i 
f s P 1 Ny 

(Cauchy-Schwarz) < T P / V — V || b ay (X l u , X J U ) - b ay (X' u , X J U ) \\ d 

JO U Y U 

f\\K 

Jo 



(assumption (H2)) < TPL 
Therefore, 



■K\\ du. 



sup||Q|| z <rPL z 

s<t 



' f ixi-xifds, 

Jo 

E[sup ye, || 2 1 £ TPL 2 f Efsupfxi - X\ || 2 1 ds. 

\-s<t J Jo L u<s J 



Similarly, for D t , 



su P ||Z)j z <r 

s<t 



i 

Jo 



1 



J2 b °r(%> X i)- b "Y&>xi) 
y=l 7 j=l 

Ny 



ds 



ft I P ! N r 

(Cauchy-Schwartz) < PT V - V || b ay (X l s , Xj ) - b ay (X l s , X J S ) 
(assumption (H2)) <PTL 2 ( V — Y\\X J S - X J S \\ 2 ) ds. 

JO \ y = 1 N Y j=1 ) 



\ds 
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Hence, we have: 

E 



sup 

s<t 



Ds\\ 2 ] 



\y=l 1 y = l 



Therefore, 



E 



sup||£>s 

s<t 



< P 2 TL 2 



f 

JO J 



max E 

=l-N l u < s 



sup\\x j u - x j u \ 



ds. 



The terms F t and G t are treated in the same fashion, but instead of using the 
Cauchy-Schwartz inequality, the Burkholder-Davis-Gundy martingale moment in- 
equality are used. For F t , in detail, 



L s<t J 



p ft 

(Cauchy-Schwartz) < 4P V / E 
p 



N. 



■£p aY (xi,xi)-i} aY (xi,xi) 



y i- 



7=1 



ds 



(Cauchy-Schwartz) <4PV f YMf/V (T s , X j s ) - p ay (T s , x{) f]ds 
y=i JoN rj=i 

(assumption (H2)) < 4P 2 L 2 j E[ || X l s - X\ || 2 ] efo 

./o 



<4L 2 /> 2 f E[sup \\Xi-Xi || 2 1 
Similarly, for G u we obtain: 



E 



sup || G, 

>-s<t 



|2 1 <4L 2 P [ max E 



# L 0<w<s 



sup 11x2 -Xd II 2 



We are left with the problem of controlling the terms E t and H t that involve sums 
of processes with bounded second moment, thanks to Proposition 3 and assump- 
tion (H3). We have: 



E sup \\E S 

*-s<t 







= E 


sup 




s<t 



K=l Y 7=1 



•Ez[^y(^,Z M )]j M 
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p c r i Ny 

(Cauchy-Schwartz) <TP^ E — b( *v ' ^ ) 

r =l 0 L K ;=1 



-E z [b ay (xi,Z s )] 



ds, 



and using the Burkholder-Davis-Gundy martingale moment inequality, 
E\sup\\H s \\ 2 ]<4PY / E — V^ y (^,x/)-E z [^y(^,Zj)] 
Each of these two expressions involves an expectation which we write: 



2-1 



ds. 



E 



i^©(^,^)-E Z [©(^,zr)] 



where 0Gj^ y ,ft y } and expand as: 
± £ E[(e(^x/)-E z ^ 

All the terms of the sum corresponding to indexes j and k such that the three condi- 
tions j ^i, k and j ^ k are satisfied are null since in that case, X\ , Xj , Xf and 
Zf are independent and have the same law for p(j) = /?(£) = y. g In effect, denoting 
the measure of their common law by m y t , we have: 

E[(e(^,z/)-E z [0(x^zr)]) r (e(x;,^)- Ez [0(xj,zr)])] 

= E[0(x;,Xi) r 0(X^)] 

&(xi,xi) T J ®(xlz)m Y s (dz) 

j ®{X i s ,z) T m v s {dz)®{X i s ,X , l) 

J ©(X^zfmUdz) j ®(xlz)m Y s (dz) , 
expanding further and renaming the second z variable to y in the last term, we obtain: 

0(x, y) T ®(x, z)m Y s (dx)m Y (dy)m Y (dz) 
j j ®(x,y) T j ®(x,z)m Y (dz)m Y (dx)m Y (dy) 



— E 
-E 
+ E 



///' 



g Note that i ^ j and i ^ k as soon as p(i) ^ p(j) = p(k) = y. In the case where = y, it is easy to 
check that when j (respectively, k) is equal to i , all terms such that k ^ j (respectively, j / k) are equal 
to 0. 
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-j j j ®(x,zfm Y (dz)®(x,y)m Y (dx)m Y (dy) 

+ j j ®(x,zfm%(dz) j ®(x,y)m Y s (dy)m Y s (dx) 

which is indeed equal to 0 by the Fubini theorem. 

Therefore, there are no more than 3N y non-null terms in the sum, and all the 
terms have the same value (that depends on 0), which is bounded by Lemma 3 and 
assumption (H3). We denote the supremum of these 2P 2 values for 0 e {b ay , /3 ay ] 
across all possible pairs of populations by C/3 , and the smallest value of the N y ,y = 
1 • • P by AWi- We have shown that 



E 

Finally, we have: 



sup || E s || 2 



4CTP 2 

and E|sup||//J z | < 



[sup || H s \\ 2 ] 



max E [sup II X\ - X\ II 2 ] < K x [ max E [sup II X J U - X J U II 2 ] du + , 

i=h-N l s <t U SU J Jo j=l-N \-u<s U " -I A^min 

for some positive constants K\ and K2. Using Gron wall's inequality, we obtain: 

max e[sup||^ - Xt f] < — ^ (36) 

i = l... N l s <V l 5 " J-^Vmin 



for some positive constant K^. The right-hand side of this inequality tends to zero 
as N goes to infinity proving the propagation of chaos property. In order to show a 
convergence with speed 1 / '\fN as stated in the theorem, we use the fact: 



max AfE 

i=l-N 



sup||^-z;|| 2 l<^/ 

s<T J 1 



Nrr 

and the right-hand side of the inequality is bounded for all Ns because of the hypoth- 

Na 
N 



esis lim^^oo ^ = c a e (0, 1) for a = 1 • • • P. This ends the proof. □ 
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